perf: bulk-apply parser-supplied per-finding tags during import #14701

Merged
Maffooch merged 7 commits into DefectDojo:dev from valentijnscholten:perf/bulk-add-tags-from-parser
Apr 20, 2026

@valentijnscholten valentijnscholten commented Apr 15, 2026

Tags are accumulated per batch and applied just before the post_process_findings_batch task is dispatched, so deduplication and rules tasks see the tags already written to the DB.
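
The ordering described above could look roughly like the sketch below. It is only an illustration: `flush_batch_sketch`, `apply_tags` and `dispatch` are hypothetical names standing in for the importer's batch handling, the bulk tag helper, and the dispatch of post_process_findings_batch; none of them are taken from the actual diff.

```python
from typing import Callable, List, Tuple

# (finding, tag names) pairs collected while a batch is being built.
TagPairs = List[Tuple[object, List[str]]]


def flush_batch_sketch(
    batch: list,
    tag_accumulator: TagPairs,
    apply_tags: Callable[[TagPairs], None],  # hypothetical stand-in for the bulk tag write
    dispatch: Callable[[list], None],        # hypothetical stand-in for dispatching the task
) -> None:
    """Write accumulated tags first, then hand the batch to post-processing."""
    if tag_accumulator:
        # Pass a copy so clearing the accumulator cannot affect the callee.
        apply_tags(list(tag_accumulator))
        tag_accumulator.clear()
    # Only dispatch after the tags are written, so deduplication and rules see them.
    dispatch(batch)
```

The point of the sketch is the order of the two calls: the tag write happens strictly before dispatch, and the accumulator is emptied so the next batch starts clean.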

Both default_importer and default_reimporter use the same approach. For the reimporter, finding_post_processing accepts an optional tag_accumulator list; when supplied, tags are accumulated rather than applied inline (backward-compatible for any direct callers).
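
A minimal sketch of the optional-accumulator pattern, assuming a much simpler signature than the real finding_post_processing (which is not shown in this description). The function name, the `apply_inline` callable and the pair layout are hypothetical.

```python
from typing import Callable, List, Optional, Tuple


def finding_post_processing_sketch(
    finding: object,
    parser_tags: List[str],
    apply_inline: Callable[[object, List[str]], None],  # hypothetical: the old per-finding path
    tag_accumulator: Optional[List[Tuple[object, List[str]]]] = None,
) -> None:
    """Defer tags to the accumulator when one is supplied, else apply them inline."""
    if not parser_tags:
        return
    # Compare against None, not truthiness: an empty list is a valid accumulator.
    if tag_accumulator is not None:
        tag_accumulator.append((finding, list(parser_tags)))
    else:
        # Direct callers that pass no accumulator keep the previous behavior.
        apply_inline(finding, parser_tags)
```

Defaulting the parameter to `None` is what keeps existing call sites working unchanged: only callers that opt in by passing a list get the deferred behavior.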

These queries are not (yet) covered by our existing performance test, because the Stackhawk parser doesn't set tags.

Calling finding.tags.add() for each finding invokes tagulous's add(), which does:
  - reload() → SELECT current tags (1 query)
  - _ensure_tags_in_db() → get_or_create per tag (T queries)
  - super().add() → INSERT through-table rows (1 query)
  - tag.increment() → UPDATE count per tag (T queries)

For N findings with T parser-supplied tags each, that is O(N·T) queries.
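
As a rough worked example of that count, the sketch below takes the per-step numbers in the list above at face value (one query per listed step). The function name is hypothetical, and real query counts may differ, for instance if get_or_create needs more than one statement.

```python
def per_finding_add_queries(num_findings: int, tags_per_finding: int) -> int:
    """Estimate total queries when tags.add() runs once per finding."""
    reload_select = 1                # reload(): SELECT current tags
    ensure_tags = tags_per_finding   # _ensure_tags_in_db(): get_or_create per tag
    insert_through = 1               # super().add(): INSERT through-table rows
    increment = tags_per_finding     # tag.increment(): UPDATE count per tag
    per_finding = reload_select + ensure_tags + insert_through + increment
    return num_findings * per_finding


# 1000 findings with 3 tags each: 1000 * (2 + 2 * 3) = 8000 queries.
print(per_finding_add_queries(1000, 3))
```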

Replace with bulk_apply_parser_tags() in tag_utils, which groups
findings by tag name and calls bulk_add_tags_to_instances() once per
unique tag: O(unique_tags) queries regardless of N.
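
The grouping step can be sketched as follows. This is not the code from tag_utils: `bulk_apply_sketch` is a hypothetical name, and `bulk_add` is a stand-in callable for bulk_add_tags_to_instances(), whose real signature is not shown in this description.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple


def bulk_apply_sketch(
    finding_tags: Iterable[Tuple[object, List[str]]],
    bulk_add: Callable[[str, List[object]], None],  # hypothetical stand-in for the bulk helper
) -> int:
    """Group findings by tag name and call bulk_add once per unique tag."""
    by_tag: Dict[str, List[object]] = defaultdict(list)
    for finding, tags in finding_tags:
        # dict.fromkeys drops a tag repeated on the same finding and keeps order.
        for tag in dict.fromkeys(tags):
            by_tag[tag].append(finding)
    for tag, findings in by_tag.items():
        bulk_add(tag, findings)
    # Number of bulk calls made: one per unique tag, independent of finding count.
    return len(by_tag)
```

With three findings tagged `["a", "b"]`, `["a"]` and `["b", "b"]`, this makes two bulk calls (one for `a`, one for `b`) instead of one add() per finding.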


@mtesauro mtesauro left a comment

Approved

@Maffooch Maffooch added this to the 2.58.0 milestone Apr 17, 2026
@Maffooch Maffooch requested review from Jino-T and blakeaowens April 17, 2026 15:02
@Maffooch Maffooch merged commit 0349f01 into DefectDojo:dev Apr 20, 2026
157 checks passed